MSGSU
DEPARTMENT OF STATISTICS - IST 335/s STATISTICAL PROGRAMMING WITH R LECTURE
NOTES by ozge.ozdamar@msgsu.edu.tr is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
License.
For errors and suggestions: ozge.ozdamar@msgsu.edu.tr
REFERENCES
Links
Cheat Sheets
LIBRARIES
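# Packages used in these notes
.packages <- c("RXKCD", "beepr", "fortunes", "Rcmdr", "fun", "plotly", "magrittr", "networkD3", "dygraphs", "devtools")
# TRUE for each package that is already installed
.inst <- .packages %in% rownames(installed.packages())
# Install only the packages that are missing
if (length(.packages[!.inst]) > 0) install.packages(.packages[!.inst])
# Attach every package; require() returns FALSE instead of stopping if one fails to load
lapply(.packages, require, character.only = TRUE)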
R is a language and environment for statistical computing and graphics, similar to the S language originally developed at Bell Labs.
library(plotly)
library(magrittr)
mtcars$am[which(mtcars$am == 0)] <- 'Automatic'
mtcars$am[which(mtcars$am == 1)] <- 'Manual'
mtcars$am <- as.factor(mtcars$am)
p <- plot_ly(mtcars, x = ~wt, y = ~hp, z = ~qsec, color = ~am, colors = c('#BF382A', '#0C4B8E')) %>%
add_markers() %>%
layout(scene = list(xaxis = list(title = 'Weight'),
yaxis = list(title = 'Gross horsepower'),
zaxis = list(title = '1/4 mile time')))
p
library(networkD3)
data(MisLinks, MisNodes)
forceNetwork(Links = MisLinks, Nodes = MisNodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
Group = "group", opacity = 0.4)
library(dygraphs)
dygraph(nhtemp, main = "New Haven Temperatures") %>%
dyRangeSelector(dateWindow = c("1920-01-01", "1960-01-01"))
Base installation: base, datasets, utils, grDevices, graphics, stats, methods …
Packages
Base package contains the basic functions which let R function as a language: arithmetic, input/output, basic programming support, etc.
Packages are collections of R functions, data, and compiled code in a well-defined format.
The directory where packages are stored on your computer is called the library.
library()
https://cran.r-project.org/ > Packages
R Home Page : https://www.r-project.org/ > Bioconductor
Version :
Depends :
Imports : packages listed here must be present for the package to work
Suggests : package can use these packages, but doesn’t require them
Citation :
Reference manual:
Vignette :
Source :
data() # list of data of installed packages
data(mtcars) # load data mtcars from package datasets
mtcars # print data to screen
###Functions
sum(mtcars$mpg)
## [1] 642.9
?sum #help for function
??sum # fuzzy search for help
###Examples
example(mtcars)
##
## mtcars> require(graphics)
##
## mtcars> pairs(mtcars, main = "mtcars data", gap = 1/4)
##
## mtcars> coplot(mpg ~ disp | as.factor(cyl), data = mtcars,
## mtcars+ panel = panel.smooth, rows = 1)
##
## mtcars> ## possibly more meaningful, e.g., for summary() or bivariate plots:
## mtcars> mtcars2 <- within(mtcars, {
## mtcars+ vs <- factor(vs, labels = c("V", "S"))
## mtcars+ am <- factor(am, labels = c("automatic", "manual"))
## mtcars+ cyl <- ordered(cyl)
## mtcars+ gear <- ordered(gear)
## mtcars+ carb <- ordered(carb)
## mtcars+ })
##
## mtcars> summary(mtcars2)
## mpg cyl disp hp drat
## Min. :10.40 4:11 Min. : 71.1 Min. : 52.0 Min. :2.760
## 1st Qu.:15.43 6: 7 1st Qu.:120.8 1st Qu.: 96.5 1st Qu.:3.080
## Median :19.20 8:14 Median :196.3 Median :123.0 Median :3.695
## Mean :20.09 Mean :230.7 Mean :146.7 Mean :3.597
## 3rd Qu.:22.80 3rd Qu.:326.0 3rd Qu.:180.0 3rd Qu.:3.920
## Max. :33.90 Max. :472.0 Max. :335.0 Max. :4.930
## wt qsec vs am gear carb
## Min. :1.513 Min. :14.50 V:18 automatic:19 3:15 1: 7
## 1st Qu.:2.581 1st Qu.:16.89 S:14 manual :13 4:12 2:10
## Median :3.325 Median :17.71 5: 5 3: 3
## Mean :3.217 Mean :17.85 4:10
## 3rd Qu.:3.610 3rd Qu.:18.90 6: 1
## Max. :5.424 Max. :22.90 8: 1
###Demonstrations
demo() # list of demos of installed packages
demo(graphics)
###Vignettes
vignette() # list of vignettes of installed packages
###GUI packages
library(Rcmdr) # Statistics GUI
###Fun :)
library(RXKCD)
searchXKCD("programming")
getXKCD(353)
library(beepr)
beep(4)
library(fortunes)
fortune()
library(fun)
gomoku()
library(fun)
if (interactive()) {
if (.Platform$OS.type == "windows")
x11() else x11(type = "Xlib")
sliding_puzzle()
sliding_puzzle(z = matrix(0:11, 3, 4))
}
devtools::install_github('RLesur/Rcade')
Rcade::games
Rcade::games$Pacman
Rcade::games$`2048`
| Package resource | Link |
|---|---|
| Microsoft R Open: latest packages | https://mran.microsoft.com/spotlight/ |
| Rdrr.io: Cran + bioconductor + Github + R-Forge | https://rdrr.io/all/ |
| rOpenSci packages | https://ropensci.org/packages/ |
| RDocumentation: CRAN + BioConductor + Github | https://www.rdocumentation.org/ |
| AWESOME R | https://awesome-r.com/ |
| github | https://github.com/qinwf/awesome-R |
| METACRAN | https://www.r-pkg.org/ |
| CRANBERRIES | http://dirk.eddelbuettel.com/cranberries/ |
| BIOCONDUCTOR | https://www.bioconductor.org/packages/release/BiocViews.html |
| RSTUDIO | https://rviews.rstudio.com/ |
| help | Link |
|---|---|
| email lists | https://www.r-project.org/mail.html |
| stackoverflow | https://stackoverflow.com/ |
| R search engine | http://search.r-project.org/ |
| R seek engine | https://rseek.org/ |
| IDE / GUI | Link |
|---|---|
| RSTUDIO | https://www.rstudio.com A powerful and productive user interface for R. Works great on Windows, Mac, and Linux. |
| Revolution R Enterprise | https://mran.microsoft.com Intended to be free for academic users, with the commercial software focusing on big data and large-scale multiprocessor functionality. |
| EMACS + ESS | http://ess.r-project.org Emacs Speaks Statistics is an add-on package for emacs text editors. |
| IRkernel | https://github.com/IRkernel/IRkernel R kernel for Jupyter. |
| StatET | http://www.walware.de/goto/statet An Eclipse based IDE for R |
| NOTEPAD ++ | https://notepad-plus-plus.org/download/v7.6.3.html |
| SCIVIEWS | http://www.sciviews.org |
| R-BRAIN | https://r-brain.io/en/ |
| RKWARD | https://rkward.kde.org |
| Radiant | https://radiant-rstats.github.io/docs/ A platform-independent, browser-based interface for business analytics in R, based on the Shiny package. |
| RTVS R TOOLS FOR VISUAL STUDIO | https://docs.microsoft.com/en-us/visualstudio/rtvs/?view=vs-2017 |
| TINN-R | https://sourceforge.net/projects/tinn-r/ |
| JGR | https://www.rforge.net/JGR/ |
| R AnalyticFlow | https://r.analyticflow.com/en/ |
| JASP | https://jasp-stats.org A complete package for both Bayesian and Frequentist methods, that is familiar to users of SPSS. |
| Rattle | https://rattle.togaware.com |
| Vim-R | https://github.com/vim-scripts/Vim-R-plugin Vim plugin for R |
| Nvim-R | https://github.com/jalvesaq/Nvim-R Neovim plugin for R |
| Bio7 | https://bio7.org An IDE with tools for model creation, scientific image analysis, and statistical analysis for ecological modelling. |
| R Commander | https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/ A package that provides a basic graphical user interface. |
| Deducer | http://www.deducer.org/pmwiki/pmwiki.php?n=Main.DeducerManual?from=Main.HomePage A menu-driven data analysis GUI with a spreadsheet-like data editor. |
Tools -> Global Options -> Pane Layout
Command completion: Tab
Command history popup: Ctrl + arrow up
Clear console: Ctrl + L
Go through historical commands: arrow up
R has a workspace known as the global environment that can be used to store the results of calculations, and many other types of objects.
get working directory
getwd()
set working directory
setwd(...)
dir()
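The working-directory commands above can be tried safely with a small sketch like the following. It assumes a throwaway folder from tempdir() and a hypothetical file name, note.txt, neither of which comes from the course notes, and it restores the original directory at the end.

```r
old_wd <- getwd()                 # remember where we started
setwd(tempdir())                  # switch to a temporary folder (assumed scratch location)
writeLines("hello", "note.txt")   # hypothetical file, created in the new working directory
found <- "note.txt" %in% dir()    # dir() lists the files in the working directory
found
unlink("note.txt")                # remove the illustrative file again
setwd(old_wd)                     # go back to the original directory
restored <- identical(getwd(), old_wd)
restored
```

Relative file names are always resolved against the working directory, which is why the file shows up in dir() without a full path.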
Save history
savehistory(file="isim.Rhistory")
loadhistory(file="isim.Rhistory")
quit R
q()
your library’s path
# your library's path
.libPaths()
list of packages in your library
library()
details of all packages installed in the specified libraries
installed.packages()
list of functions in a package
library(help = "base")
install packages
install.packages("PackageName") # from CRAN
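BiocManager::install("PackageName") # from Bioconductor (replaces the older biocLite())
devtools::install_github("user/PackageName") # from GitHub; "user" is a placeholder for the repository owner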
load packages
library(PackageName)
require(PackageName)
list of loaded packages
search()
directories of attached (loaded) packages
searchpaths()
update library
update.packages()
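As a rough illustration of how library() and require() differ when a package is not installed, the sketch below uses a made-up package name (aPackageThatDoesNotExist, assumed to be absent from your library). It also shows requireNamespace(), which checks availability without attaching anything.

```r
pkg_missing <- "aPackageThatDoesNotExist"   # hypothetical name, assumed not installed

# requireNamespace() reports whether a package can be loaded, without attaching it
has_stats   <- requireNamespace("stats", quietly = TRUE)
has_missing <- requireNamespace(pkg_missing, quietly = TRUE)

# require() returns FALSE (with a warning) when the package cannot be loaded
req_result <- suppressWarnings(
  require(pkg_missing, character.only = TRUE, quietly = TRUE)
)

# library() stops with an error instead, which is caught here for demonstration
lib_result <- tryCatch(
  {
    library(pkg_missing, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

c(has_stats, has_missing, req_result)
lib_result
```

In scripts, library() is usually the safer choice because a missing package stops execution immediately rather than failing later.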
5
## [1] 5
5+7
## [1] 12
10.5
## [1] 10.5
56/2
## [1] 28
3*6
## [1] 18
5-10
## [1] -5
3^2
## [1] 9
14/5+3
## [1] 5.8
14/(5+3)
## [1] 1.75
In R, an object is anything that can be assigned to a variable. This includes constants, data structures, functions, and even graphs. Objects have a mode (which describes how the object is stored) and a class (which tells generic functions like print how to handle it).
The assignment operator is the left-pointing arrow: <-
a <- 5 # my first variable :P
a
## [1] 5
b <- 6*8-4
b
## [1] 44
b <- b*10
Case sensitive: A and a are different
Turkish characters should not be used
Names must not start with a digit but may be alphanumeric: b1, m5
Separators: _ and .
Words that are already commands should not be used as object names (print, scan, c)
Develop a consistent logic for naming your variables:
Var_name2 : Valid - contains only letters, numbers, and an underscore (dots are also allowed).
2var_name : Invalid - Starts with a number.
Var_name% : Invalid - Contains the percentage (%) sign.
.2var_name : Invalid - Starts with a dot followed by a number.
.var_name, var.name : Valid - A name may start with a dot as long as that dot is not followed by a number.
_var_name : Invalid - Starts with an underscore.
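The naming rules above can be checked programmatically. This is a minimal sketch built on base R's make.names(), which rewrites any string into a syntactically valid name; the helper is_valid_name() is a hypothetical name introduced here for illustration.

```r
# A name is syntactically valid exactly when make.names() leaves it unchanged
is_valid_name <- function(x) make.names(x) == x   # hypothetical helper

candidates <- c("Var_name2", "2var_name", "Var_name%", ".2var_name",
                ".var_name", "var.name", "_var_name")
validity <- is_valid_name(candidates)
setNames(validity, candidates)
# Var_name2 TRUE, 2var_name FALSE, Var_name% FALSE, .2var_name FALSE,
# .var_name TRUE, var.name TRUE, _var_name FALSE
```

The check only covers syntax. A name such as c or print passes it but still shadows an existing function, so the advice about avoiding command names applies separately.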
In every computer language variables provide a means of accessing the data stored in memory. R does not provide direct access to the computer’s memory but rather provides a number of specialized data structures we will refer to as objects. These objects are referred to through symbols or variables. In R, however, the symbols are themselves objects and can be manipulated in the same way as any other object. This is different from many other languages and has wide ranging effects.
ob1 <- 54
ob2 <- "merhaba"
ob3 <- 'F'
ob4 <- TRUE
ob5 <- F
ob6 <-c(1,4,5) # this is a vector
ob7 <- mtcars
ob8 <- 54L
ob1 # print object to screen
## [1] 54
print(ob1)
## [1] 54
class(ob1)
## [1] "numeric"
mode(ob1)
## [1] "numeric"
typeof(ob1)
## [1] "double"
class(ob8)
## [1] "integer"
mode(ob8)
## [1] "numeric"
typeof(ob8)
## [1] "integer"
str(ob2) #structure of object
## chr "merhaba"
dim(ob2) # dimension of object
## NULL
length(ob2) # length of object
## [1] 1
names(ob2) # names of the object's elements, if available
## NULL
summary(ob2) #
## Length Class Mode
## 1 character character
is.vector(ob1) # ask object type
## [1] TRUE
as.matrix(ob1) # change object type
## [,1]
## [1,] 54
rm(ob1) #remove object
objects()
## [1] "a" "b" "MisLinks" "MisNodes" "mtcars" "mtcars2"
## [7] "ob2" "ob3" "ob4" "ob5" "ob6" "ob7"
## [13] "ob8" "p"
ls()
## [1] "a" "b" "MisLinks" "MisNodes" "mtcars" "mtcars2"
## [7] "ob2" "ob3" "ob4" "ob5" "ob6" "ob7"
## [13] "ob8" "p"
#remove()
#rm()
| Object | class() | mode() | typeof() | storage.mode() |
|---|---|---|---|---|
| ob1<-54 | numeric | numeric | double | double |
| ob2<-"merhaba" | character | character | character | character |
| ob3<-'F' | character | character | character | character |
| ob4<-TRUE | logical | logical | logical | logical |
| ob5<-F | logical | logical | logical | logical |
| ob6<-c(1,4,5) | numeric | numeric | double | double |
| ob7<-mtcars | data.frame | list | list | list |
| ob8<-54L | integer | numeric | integer | integer |
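The table can be reproduced in code. The sketch below rebuilds most of its rows by applying the four functions to a list of the same example objects; the names objs and type_table are introduced here only for illustration.

```r
# Same example objects as in the table (ob3 and ob5 omitted for brevity)
objs <- list(ob1 = 54, ob2 = "merhaba", ob4 = TRUE,
             ob6 = c(1, 4, 5), ob7 = mtcars, ob8 = 54L)

type_table <- data.frame(
  class        = sapply(objs, function(x) class(x)[1]),  # first class only
  mode         = sapply(objs, mode),
  typeof       = sapply(objs, typeof),
  storage.mode = sapply(objs, storage.mode),
  row.names    = names(objs)
)
type_table
```

The rows for ob1 and ob8 show the main point: 54 and 54L share the mode "numeric" but differ in typeof ("double" versus "integer").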
The following table describes the possible values returned by typeof and what they are. Users cannot easily get hold of objects of the types marked with ***.
| typeof() | definition |
|---|---|
| “NULL” | NULL |
| “symbol” | a variable name |
| “pairlist” | a pairlist object (mainly internal) |
| “closure” | a function |
| “environment” | an environment |
| “promise” | an object used to implement lazy evaluation |
| “language” | an R language construct |
| “special” | an internal function that does not evaluate its arguments |
| “builtin” | an internal function that evaluates its arguments |
| “char” | a ‘scalar’ string object (internal only) *** |
| “logical” | a vector containing logical values |
| “integer” | a vector containing integer values |
| “double” | a vector containing real values |
| “complex” | a vector containing complex values |
| “character” | a vector containing character values |
| “…” | the special variable length argument *** |
| “any” | a special type that matches all types: there are no objects of this type |
| “expression” | an expression object |
| “list” | a list |
| “bytecode” | byte code (internal only) *** |
| “externalptr” | an external pointer object |
| “weakref” | a weak reference object |
| “raw” | a vector containing bytes |
| “S4” | an S4 object which is not a simple object |
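Several of the types in the table can be observed directly. This is a small sketch using only base R objects; the vector name type_examples is made up for the example.

```r
type_examples <- c(
  null        = typeof(NULL),
  symbol      = typeof(quote(x)),          # a variable name
  language    = typeof(quote(x + 1)),      # an unevaluated call
  closure     = typeof(mean),              # an ordinary R function
  builtin     = typeof(sum),               # internal, evaluates its arguments
  special     = typeof(`if`),              # internal, does not evaluate its arguments
  environment = typeof(globalenv()),
  complex     = typeof(2 + 3i),
  raw         = typeof(as.raw(255)),
  expression  = typeof(expression(x + 1)),
  list        = typeof(list(1, "a"))
)
type_examples
```

Each element of the result equals its own name, except the first, which is "NULL" in capitals.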
v1 <- c(3,1,TRUE,2+3i)
v2 <<- c(3,1,TRUE,2+3i)
v3 = c(3,1,TRUE,2+3i)
v3
## [1] 3+0i 1+0i 1+0i 2+3i
15-> v4
v4
## [1] 15
http://stat.ethz.ch/R-manual/R-patched/library/base/html/assignOps.html
There are three different assignment operators: two of them have leftwards and rightwards forms.
The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment. Note that their semantics differ from that in the S language, but are useful in conjunction with the scoping rules of R. See ‘The R Language Definition’ manual for further details and examples.
In all the assignment operator expressions, x can be a name or an expression defining a part of an object to be replaced (e.g., z[[1]]). A syntactic name does not need to be quoted, though it can be (preferably by backticks).
The leftwards forms of assignment <- = <<- group right to left, the other from left to right.
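The difference between <- and <<- is easiest to see inside a function. The following is a minimal sketch; counter, increment, and scratch are illustrative names that do not appear in the notes.

```r
counter <- 0

increment <- function() {
  counter <<- counter + 1   # updates the existing 'counter' found in an enclosing environment
  scratch <- "local"        # <- creates a variable that exists only inside the function
  invisible(counter)
}

increment()
increment()
counter             # 2: the outer variable was changed twice

a1 <- a2 <- 5       # leftward forms group right to left: a2 is assigned first, then a1
z <- list(1, 2)
z[[1]] <- "first"   # the left-hand side may name a part of an object
10 -> a3            # rightward form
```

After the two calls, scratch still does not exist outside the function, while counter has been modified from within it.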
Logical operators apply to logical, numeric, and complex vectors. Any non-zero number is treated as TRUE and zero is treated as FALSE. Each element of the first vector is compared with the corresponding element of the second vector.
| Operator | Symbol |
|---|---|
| Element-wise AND | "&" |
| Element-wise OR | "\|" |
| NOT | "!" |
| Logical AND | "&&" |
| Logical OR | "\|\|" |
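The operators & and | work element-wise and produce a result
as long as the longer operand. The operators && and || are meant
for length-one operands and return a single logical value. In the R
version used for these notes they examine only the first element of a
longer operand and issue a warning; from R 4.3.0 onward, passing an
operand of length greater than one is an error.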
Zero is considered FALSE and non-zero numbers are taken as TRUE.
& : AND
It is called the element-wise logical AND operator. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if both elements are TRUE.
x1<-c(1,2,3,4)
x1
## [1] 1 2 3 4
x2<-c(10,20,30,40)
x2
## [1] 10 20 30 40
x1&x2
## [1] TRUE TRUE TRUE TRUE
v <- c(3, 1, TRUE, 2+3i)
v
## [1] 3+0i 1+0i 1+0i 2+3i
t <- c(4, 1, FALSE, 2+3i)
t
## [1] 4+0i 1+0i 0+0i 2+3i
v&t
## [1] TRUE TRUE FALSE TRUE
| : OR
It is called the element-wise logical OR operator. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if at least one of the elements is TRUE.
x1|x2
## [1] TRUE TRUE TRUE TRUE
v|t
## [1] TRUE TRUE TRUE TRUE
! : NOT
It is called the logical NOT operator. It takes each element of the vector and gives the opposite logical value.
!x1
## [1] FALSE FALSE FALSE FALSE
!x2
## [1] FALSE FALSE FALSE FALSE
!v
## [1] FALSE FALSE FALSE FALSE
!t
## [1] FALSE FALSE TRUE FALSE
&& : Logical AND
It is called the logical AND operator. It takes the first element of each vector and gives TRUE only if both are TRUE.
x1&&x2
## Warning in x1 && x2: 'length(x) = 4 > 1' in coercion to 'logical(1)'
## Warning in x1 && x2: 'length(x) = 4 > 1' in coercion to 'logical(1)'
## [1] TRUE
v&&t
## Warning in v && t: 'length(x) = 4 > 1' in coercion to 'logical(1)'
## Warning in v && t: 'length(x) = 4 > 1' in coercion to 'logical(1)'
## [1] TRUE
|| : Logical OR
It is called the logical OR operator. It takes the first element of each vector and gives TRUE if at least one of them is TRUE.
x1||x2
## Warning in x1 || x2: 'length(x) = 4 > 1' in coercion to 'logical(1)'
## [1] TRUE
v||t
## Warning in v || t: 'length(x) = 4 > 1' in coercion to 'logical(1)'
## [1] TRUE
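The sketch below contrasts the element-wise operators with the scalar ones, using length-one operands for && and || so that it runs without warnings or errors in any R version. The vectors x and y are illustrative values, not taken from the notes.

```r
x <- c(1, 0, 3)
y <- c(0, 0, 5)

elementwise_and <- x & y     # FALSE FALSE  TRUE: non-zero counts as TRUE
elementwise_or  <- x | y     #  TRUE FALSE  TRUE

# && and || work on single values and stop as soon as the result is known
scalar_and    <- length(x) > 0 && x[1] > 0                  # TRUE
short_circuit <- is.null(NULL) || stop("never evaluated")   # TRUE, stop() is not reached

# To reduce a whole vector to one value, use any() or all()
any_both   <- any(x & y)     # TRUE
all_either <- all(x | y)     # FALSE
```

The scalar forms are the usual choice inside if() conditions, while the element-wise forms are used for filtering vectors.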
Relational operators compare each element of the first vector with the corresponding element of the second vector and return logical (TRUE/FALSE) values.
| Operator | Meaning |
|---|---|
| < | Less than |
| > | Greater than |
| <= | Less than or equal to |
| >= | Greater than or equal to |
| == | Equal to |
| != | Not equal to |
v <- c(2, 5.5, 6, 9)
t <- c(8, 2.5, 14, 9)
v < t
## [1] TRUE FALSE TRUE FALSE
v > t
## [1] FALSE TRUE FALSE FALSE
v <= t
## [1] TRUE FALSE TRUE TRUE
v >= t
## [1] FALSE TRUE FALSE TRUE
v == t
## [1] FALSE FALSE FALSE TRUE
v != t
## [1] TRUE TRUE TRUE FALSE
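Two common uses of relational operators are worth a quick sketch: filtering a vector with the logical result, and comparing floating-point numbers, where == can be surprising. The variable names are illustrative; all.equal() is the base R function for comparison with a tolerance.

```r
v <- c(2, 5.5, 6, 9)

above_five <- v > 5          # FALSE TRUE TRUE TRUE
selected   <- v[above_five]  # keeps 5.5, 6 and 9
n_above    <- sum(above_five)  # TRUE counts as 1, so this is 3

# Exact equality on doubles can fail because of rounding error
naive    <- (0.1 + 0.2) == 0.3                    # FALSE
tolerant <- isTRUE(all.equal(0.1 + 0.2, 0.3))     # TRUE
```

For computed decimal values, the tolerant comparison is generally the safer test.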
Arithmetic operators are used to carry out mathematical operations.
| Operator | Meaning |
|---|---|
| + | Addition |
| - | Subtraction |
| * | Multiplication |
| / | Division |
| ^ | Exponent |
| %% | Modulus (Remainder from division) |
| %/% | Integer Division |
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
v + t
## [1] 10.0 8.5 10.0
v - t
## [1] -6.0 2.5 2.0
v * t
## [1] 16.0 16.5 24.0
v / t
## [1] 0.250000 1.833333 1.500000
v ^ t
## [1] 256.000 166.375 1296.000
v %% t
## [1] 2.0 2.5 2.0
v %/% t
## [1] 0 1 1
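Integer division and modulus are linked by the identity x = (x %/% y) * y + (x %% y). The sketch below checks it on a few illustrative values, including a negative one, and also shows that ^ binds more tightly than unary minus.

```r
x <- c(7, -7, 7.5)
y <- 3

quotient  <- x %/% y    #  2 -3  2   (rounds towards minus infinity)
remainder <- x %% y     #  1  2  1.5 (takes the sign of the divisor)

# The two results always recombine to the original value
identity_holds <- isTRUE(all.equal(x, quotient * y + remainder))

# Operator precedence: ^ is applied before unary minus
neg_square   <- -2^2     # -4
paren_square <- (-2)^2   #  4
```

The precedence rule also explains why -2 ^ 1024 evaluates to -Inf: it is read as -(2 ^ 1024).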
These operators are used for specific purposes rather than for general mathematical or logical computation.
| operator | meaning |
|---|---|
| : | Colon operator. It creates the series of numbers in sequence for a vector. |
| %in% | This operator is used to identify if an element belongs to a vector. |
| %*% | This operator is used to multiply matrices |
v1 <- 8
v2 <- 12
t <- 1:10
v1 %in% t
## [1] TRUE
v2 %in% t
## [1] FALSE
M = matrix(1:6, nrow = 2,ncol = 3,byrow = TRUE)
t = M %*% t(M)
t
## [,1] [,2]
## [1,] 14 32
## [2,] 32 77
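A short sketch of the same three operators on other inputs: %in% is vectorised over its left-hand side, and %*% requires the column count of the first matrix to match the row count of the second. The matrices A and B are made-up examples.

```r
countdown <- 7:3                      # the colon operator also counts downwards: 7 6 5 4 3

hits <- c(2, 11, 5) %in% 1:10         # TRUE FALSE TRUE, one answer per element on the left

A <- matrix(1:6, nrow = 2)            # 2 x 3
B <- matrix(1:6, nrow = 3)            # 3 x 2
product     <- A %*% B                # matrix product, 2 x 2
elementwise <- A * A                  # plain * multiplies element by element, 2 x 3

dim(product)
dim(elementwise)
```

Using names such as A and B also avoids overwriting t, which is the name of base R's transpose function.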
getwd()
setwd()
ls()
rm()
help(options)
options()
history()
savehistory()
loadhistory()
save.image()
save(...)
load()
q()
help.start()
help(sum) # or
?sum
help.search("sum") # or
??sum
example(sum)
RSiteSearch("sum")
apropos("sum", mode="function")
data()
demo()
vignette()
vignette("sum")
NA : Not Available
NaN : Not a Number
Inf : Infinity
-Inf : Negative Infinity
NA
In R, NA (Not Available) values are used to represent missing values.
x1<-NA
x1
## [1] NA
is.na(x1) # returns TRUE if x1 is missing
## [1] TRUE
is.nan(x1)
## [1] FALSE
is.finite(x1)
## [1] FALSE
is.infinite(x1)
## [1] FALSE
class(x1)
## [1] "logical"
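Missing values propagate through most computations, so they usually have to be handled explicitly. This is a minimal sketch with an illustrative vector x; na.rm is the standard argument of mean() and similar base R functions.

```r
x <- c(2, 5, NA, 6)

mean_with_na <- mean(x)                 # NA: the missing value propagates
mean_without <- mean(x, na.rm = TRUE)   # 13/3, computed from the three known values
n_missing    <- sum(is.na(x))           # 1
complete     <- x[!is.na(x)]            # 2 5 6

# Comparing with NA gives NA, so test for missingness with is.na(), not ==
comparison <- NA == 1                   # NA
```

Because NA == 1 is itself NA, a condition such as if (x == NA) never works as intended.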
NaN
Sometimes, a computation will produce a result that makes little sense. In these cases, R will often return NaN (meaning "not a number").
x2<-0/0
x2
## [1] NaN
is.na(x2)
## [1] TRUE
is.nan(x2)
## [1] TRUE
is.finite(x2)
## [1] FALSE
is.infinite(x2)
## [1] FALSE
class(x2)
## [1] "numeric"
Inf - Inf
## [1] NaN
Inf
If a computation results in a number that is too big, R will return Inf for a positive number and -Inf for a negative number. This is also the value returned when you divide a non-zero number by 0:
x3a<-10/0
x3a
## [1] Inf
is.na(x3a)
## [1] FALSE
is.nan(x3a)
## [1] FALSE
is.finite(x3a)
## [1] FALSE
is.infinite(x3a)
## [1] TRUE
class(x3a)
## [1] "numeric"
2 ^ 1024
## [1] Inf
x3b <- -10/0
x3b
## [1] -Inf
is.na(x3b)
## [1] FALSE
is.nan(x3b)
## [1] FALSE
is.finite(x3b)
## [1] FALSE
is.infinite(x3b)
## [1] TRUE
class(x3b)
## [1] "numeric"
-2 ^ 1024
## [1] -Inf
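The four test functions used above can be compared side by side on one vector. The sketch below collects the results in a data frame; the names special and checks are introduced only for this example.

```r
special <- c(1, NA, NaN, Inf, -Inf)

checks <- data.frame(
  value    = special,
  na       = is.na(special),        # TRUE for both NA and NaN
  nan      = is.nan(special),       # TRUE only for NaN
  finite   = is.finite(special),    # TRUE only for ordinary numbers
  infinite = is.infinite(special)   # TRUE for Inf and -Inf
)
checks
```

Note that is.na() is TRUE for NaN as well, whereas is.nan() is FALSE for NA, so the two functions are not interchangeable.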
NULL is often used to explicitly define an “empty” entity, which is quite different from a “missing” entity specified with NA.
NULL is often used as an argument in functions to mean that no value was assigned to the argument. Additionally, some functions may return NULL.
x1<-NULL
x1
## NULL
x2<-NA # NA is a length-one vector, so it prints with the index [1]
x2
## [1] NA
x1<-c(2,5,NA,6)
x1
## [1] 2 5 NA 6
length(x1)
## [1] 4
x2<-c(2,5,NULL,6)
x2
## [1] 2 5 6
length(x2)
## [1] 3
NULL+5
## numeric(0)
NA +5
## [1] NA
5<=NULL
## logical(0)
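The role of NULL as a "no value supplied" marker for function arguments can be sketched as follows. The function describe() and the list settings are hypothetical examples, not part of the course notes.

```r
# NULL as a default argument: the function fills in a value when none is given
describe <- function(x, label = NULL) {
  if (is.null(label)) label <- "unnamed"
  paste(label, "has", length(x), "elements")
}

d1 <- describe(1:3)                # "unnamed has 3 elements"
d2 <- describe(1:3, label = "v")   # "v has 3 elements"

# Assigning NULL to a list element removes it
settings <- list(a = 1, b = 2)
settings$b <- NULL
remaining <- names(settings)       # "a"

null_length <- length(NULL)        # 0: NULL is empty, unlike NA which has length 1
na_is_null  <- is.null(NA)         # FALSE
```

Testing with is.null() is the reliable check here, since comparisons such as label == NULL return logical(0) rather than TRUE or FALSE.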